Preferential Exchange: Strengthening Connections in Complex Networks.
Many social, technological and biological interactions involve network relationships whose outcome
intimately depends on the structure of the network and on the strengths of the connections. Yet, 
although much information is now available concerning the structure of many networks, the strengths 
are more difficult to measure. Here we show that, for one particular social network, namely the
e-mail network, a suitable measure of the strength of the connections is available. We also
propose a simple mechanism, based on positive feedback and reciprocity, that can explain the 
observed behavior and that hints toward specific dynamics of formation and reinforcement of network 
connections. Network data from contexts different from social sciences indicate that power-law, 
and generally broad, distributions of the connection strength are ubiquitous, and the proposed 
mechanism has a wide range of applicability. 

PACS numbers: 05.65.+b, 89.75.-k



Networks are the most general framework to describe technological,
biological, social and other systems. The nodes of the network (Internet
routers, Web pages, proteins, species, companies and so on) are linked by
connections that are present or absent depending on the node relations we
are interested in. In the case of the Internet and of the WWW it is clear
what a connection is: cables or hyperlinks, respectively. In other cases
connections can depend on the definition: for example, we may say that
proteins interact if they physically stick to each other, or if one of the
two promotes the expression of the other. Species interact by predation in
food webs, and in the case of companies one possible relation is given by
the companies' portfolios. Social relations between individuals can be of
many kinds and purposes, from business to mutual assistance to friendship
and others. The choice of the type of relation defines the network and its
structure, but we also need the strength of the connections to fully
characterize the network. In the social context, for example, the strength
of a relation is important to determine the best route to pass information
to, or gather information from, somebody else in the system. Strong social
ties may be regarded as preferential and reliable information channels.

All the above networks display the small-world property, i.e. the average
distance between nodes grows only logarithmically with the size of the
network. As such, small-world networks are usually considered optimal to
distribute or collect information. Yet, whenever some of the connections
become unreliable, the effective average distance can become rather large.
In this respect the reliability of a connection, and ultimately the
robustness of the network, can be assessed through the strength of the
various connections. The most recent studies indeed complement the
attention to the network topology with an investigation of the weights of
the edges. Yet, although the weights of the connections are clearly very
important, their determination is a difficult task. It is relatively easy
to decide whether two individuals are connected or not (since the existence
of a link between them is essentially a binary variable); it is much more
difficult to quantify the strength of such a connection. How can we measure
in an objective way how much two people are, for example, friends? Here we
show that for e-mail networks (a particular instance of social network)
such a measure is possible. We believe that this example provides clues on
the mechanism by which network connections form, develop and strengthen. We
also introduce a model, based on the idea of preferential exchange, whose
applicability can in principle be extended to other contexts.

Modern computer networks are inherently social networks, since they link
people and organizations and allow the exchange of information and
communications [12]. In particular the exchange of e-mails between people
defines a paradigmatic example of computer-supported social network that is
the object of many recent studies. In e-mail networks a link between two
people is established whenever they exchange an e-mail (or a threshold
number of e-mails [14]). By browsing the e-mail folders of an individual
(each folder represents a different e-mail sender), it is easy to check
that, after a few years, the number of connections for the average person
can grow to the hundreds. A careful analysis of the network is therefore
necessary to reveal the presence of groups with common interests and
purposes and the hierarchical organization of these groups.

We introduce an objective measure of the strength of the relations by
keeping track of the number of e-mails received from a given sender in
somebody's e-mail directory. The data sets that we analyze are five e-mail
directories coming from our accounts and the accounts of two other
colleagues. They contain 5628 e-mails (corresponding to 393 senders)
collected over roughly three years, 19219 e-mails (476 senders, ten years),
16102 e-mails (113 senders, three years), 13385 e-mails (516 senders, five
years) and 21782 e-mails (207 senders, five years). Fig. 1 shows the
normalized histograms of the number $N(k)$ of people who wrote $k$ e-mails
to us and our colleagues. As can be seen, they are quite similar, and they
can be approximated by an algebraic behavior of the kind
$N(k) \sim k^{-\gamma}$ with $\gamma \simeq 1.6$. Although, of course, the
five datasets contain some common acquaintances, they are mostly
uncorrelated, so that we consider them to be representative of the same
universal behavior.
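The histogram construction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text says the data were binned exponentially but gives neither the bin base nor the fitting procedure, so the base-2 bins, the least-squares fit in log-log space and the function names `log_binned_histogram` and `fit_power_law_exponent` are all hypothetical choices, not the authors' method.

```python
import math
from collections import Counter


def log_binned_histogram(folder_sizes):
    """Normalized histogram of integer folder sizes over bins [2**b, 2**(b+1)).

    folder_sizes: one integer per sender, the number k of e-mails received
    from that sender. Base-2 bins are an assumption made for this sketch.
    """
    sizes = [k for k in folder_sizes if k >= 1]
    counts = Counter(k.bit_length() - 1 for k in sizes)  # bin index b
    total = len(sizes)
    histogram = []
    for b in sorted(counts):
        width = 2 ** b                      # integers contained in the bin
        center = math.sqrt(2.0) * 2 ** b    # geometric midpoint of the bin
        histogram.append((center, counts[b] / (width * total)))
    return histogram


def fit_power_law_exponent(points):
    """Least-squares slope of log(density) against log(k); returns gamma."""
    xs = [math.log(k) for k, _ in points]
    ys = [math.log(density) for _, density in points]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    variance = sum((x - x_mean) ** 2 for x in xs)
    return -covariance / variance
```

Dividing each bin count by the bin width keeps the binned values comparable to a density, so a power law still appears as a straight line of slope $-\gamma$ on a log-log plot.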

An algebraic law, rather than a simple exponential, 
is usually a symptom of the presence of some form of 
correlations in the dynamical process that produced the 
data. How do correlations arise in this context? A very 
simple mechanism that is known to produce such correlations
is a form of positive feedback that, in the social context,
can be described as "good partners become better partners".
Stated otherwise, there is a reinforcement
mechanism such that if the relation between two people 
is already strong, it has more chances to become even 
stronger. 

To check whether this mechanism allows for the creation and reinforcement
of social links in such a way as to reproduce the empirical data, we have
analyzed a very simple model. Starting from a society of $S_0$ individuals,
at every time-step each of them sends $M_{out}$ e-mail messages to the
others, at random. The network of acquaintances grows in time: at every
time-step a new individual enters the society. The probability that
individual $j$ sends a message to individual $i$ is proportional to the
number $k(i \to j)$ of e-mails that $j$ ever received from $i$, that is



$$ p(j \to i) = \frac{k(i \to j)}{\sum_{l} k(l \to j)} \qquad (1) $$



(the sum in the denominator is the total number of e-mails ever received by
$j$). We call this rule preferential exchange. In some respects this choice
is reminiscent of the idea of preferential attachment in the formation of
growing scale-free networks [18], even if, as we discuss in the following,
the physical meaning is rather different. More generally, preferential
exchange is also close in spirit to the Tit-for-Tat reciprocity strategy,
believed to be an important ingredient to explain the emergence of
cooperation and altruism between individuals [19].

Fig. 2 shows the results of simulations with $S_0 = 2$, followed for 1998
time-steps, to a final size of $S = 2000$ individuals; at every time-step
each individual sends out $M_{out} = 100$ e-mails. As a starting condition
we assume that every new individual has already exchanged one e-mail with
everybody else: thus, the structure of the e-mail network is trivial, being
fully connected. The e-mail distributions of random individuals in the
population are very similar to each other and all exhibit the same
algebraic behavior $N(k) \sim k^{-\gamma}$ with an exponent
$\gamma \simeq 1.8(2)$. Notably, the result does not depend on the choice
of the above parameters.
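A minimal sketch of the model just described may help fix the bookkeeping. Several details are assumptions of this sketch rather than statements of the text: all senders in a time-step use the counts recorded at the start of that step, the newcomer joins at the end of the step, and the names `simulate_preferential_exchange`, `folder_sizes` and `sent` are hypothetical. The default sizes are kept small; the values quoted above ($S = 2000$, $M_{out} = 100$) would be slow in pure Python.

```python
import random


def simulate_preferential_exchange(s0=2, s_final=50, m_out=10, seed=0):
    """Return sent, where sent[i][j] is the number of e-mails i sent to j."""
    if s0 < 2:
        raise ValueError("need at least two initial individuals")
    rng = random.Random(seed)
    # Initial society: every pair has already exchanged one e-mail each way.
    sent = [[0 if i == j else 1 for j in range(s0)] for i in range(s0)]
    for _ in range(s_final - s0):
        n = len(sent)
        # Assumption: senders see the counts as they were when the step began.
        snapshot = [row[:] for row in sent]
        for j in range(n):
            # Eq. (1): j writes to i with weight k(i -> j), i.e. the number
            # of e-mails j has received from i (zero for i == j).
            weights = [snapshot[i][j] for i in range(n)]
            for i in rng.choices(range(n), weights=weights, k=m_out):
                sent[j][i] += 1
        # A newcomer enters, having exchanged one e-mail with everybody else.
        for row in sent:
            row.append(1)
        sent.append([1] * n + [0])
    return sent


def folder_sizes(sent, i):
    """E-mails received by individual i from each other individual."""
    return [sent[j][i] for j in range(len(sent)) if j != i]
```

The list returned by `folder_sizes(sent, i)` plays the role of one e-mail directory, so its histogram is the model counterpart of $N(k)$ for individual $i$.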

The solution of the model can also be obtained analytically, by means of a
few approximations that allow for the identification of the parameters
relevant for the model. Indeed, it is possible to write a rate equation for
$k(j \to i)$:



$$ \frac{dk(j \to i)}{dt} = M_{out} \cdot p(j \to i) = M_{out} \, \frac{k(i \to j)}{\sum_{l} k(l \to j)} \qquad (2) $$



We assume that an individual receives e-mails at a constant rate $M_{in}$,
so that the denominator in the r.h.s. of Eq. (2) grows linearly in time, as
$M_{in} \cdot t$. We have verified this linear dependence on time in our
simulations, finding furthermore that $M_{in} \simeq M_{out}$. Moreover, we
assume that there is reciprocity in the e-mail exchange, that is, the
number of e-mails that $i$ ever sent to $j$ is proportional to the number
of e-mails that $j$ ever sent to $i$. This allows us to replace the
numerator of the r.h.s. of Eq. (2) using $k(i \to j) = R \cdot k(j \to i)$.
We have also verified this proportionality in our simulations, finding
$R \simeq 1$, an indication of the so-called fair reciprocity. With these
assumptions the rate equation simplifies to

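$$ \frac{dk(j \to i, t)}{dt} = \alpha \, \frac{k(j \to i, t)}{t} \qquad (3) $$

with $\alpha = R \, (M_{out}/M_{in})$. The solution of Eq. (3) is

$$ k(j \to i, t) = \left( \frac{t}{t_0} \right)^{\alpha} \qquad (4) $$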

If $t_i$ ($t_j$) is the time at which individual $i$ ($j$) entered
the society, we set $t_0 = \max(t_i, t_j)$ (and of course $t_0 < t$).
If $j$ is younger than $i$ then $t_0 = t_j$ and we can invert the
solution to obtain



$$ t_j = t \, [k(j \to i)]^{-1/\alpha} \qquad (5) $$
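The closed form and its inversion can be checked numerically. The sketch below integrates the simplified rate equation $dk/dt = \alpha k / t$ with a plain Euler scheme, starting from $k(t_0) = 1$ (the single e-mail exchanged on entry, as implied by the power-law solution above), and compares the result with $(t/t_0)^{\alpha}$. The function names and the choice of the Euler method are assumptions made for illustration only.

```python
def integrate_rate_equation(alpha, t0, t_end, steps=100_000):
    """Euler integration of dk/dt = alpha * k / t with k(t0) = 1."""
    h = (t_end - t0) / steps
    k = 1.0
    for n in range(steps):
        t = t0 + n * h          # time at the start of the n-th step
        k += h * alpha * k / t
    return k


def closed_form(alpha, t0, t):
    """Power-law solution k = (t / t0) ** alpha."""
    return (t / t0) ** alpha


def entry_time(alpha, t, k):
    """Inverse relation: entry time of the sender whose folder has size k."""
    return t * k ** (-1.0 / alpha)
```

With, say, `alpha = 0.8`, `t0 = 5` and `t_end = 500`, the numerical and closed-form values should agree closely, and `entry_time` applied to the closed-form value should return `t0`.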



Eq. (5) sets a one-to-one relation between $t_j$ and $k(j \to i)$ that
allows us to use the probability conservation relation
$N(k)\,dk = p(t)\,dt$, where $p(t) = \mathrm{const}$ because new
individuals are added at a constant rate. Since Eq. (5) gives
$|dt_j/dk| \propto k^{-1/\alpha - 1}$, we have that
$N(k) \sim k^{-\gamma}$ with
$\gamma = 1 + 1/\alpha = 1 + (M_{in}/M_{out})/R$. If on the contrary $j$ is
older than $i$, then $t_0 = t_i$ and these folders should contribute to a
peak of $N(k)$ at $k = (t/t_i)^{\alpha}$, independent of $j$. We do not
observe this peak in our simulations: if we split the histogram of
individual $i$ into the two contributions of people older and younger than
$i$, we find that they show the same algebraic behavior (data not shown).
The predicted peak is an artifact of the mean-field nature of the above
calculation, in which fluctuations have been neglected. In the actual
dynamics fluctuations are present and are enhanced by the positive feedback
mechanism; as a consequence, their combined effect drives the system to the
same distribution $N(k)$ for individuals both younger and older than $i$.
In the case of perfect reciprocity ($R = 1$) and if people reply to every
e-mail they receive ($M_{in}/M_{out} = 1$), the exponent is $\gamma = 2$,
close to the results from our simulations.

Actually, some of the approximations that we made
can be safely relaxed. In particular we might assume that, depending on
personality, some people have a tendency to write slightly more e-mails
than they receive, i.e. $M_{out}/M_{in} > 1$, or vice versa (although very
large or very small values are unreasonable and we expect real values to be
close to 1); also, reciprocity could be imperfect, again for personality
reasons, with $R \neq 1$ (but again very large or very small values are
unreasonable; this has also been verified in our simulations). In these
cases we can expect variations of the exponent $\gamma$ (although nothing
forbids large variations of this exponent, we expect the exponents to
always be close to 2, as the data in Fig. 1 show). Changing the values of
$S$, $S_0$ and $M_{out}$ does not change the results of our simulations.
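To see how the exponent responds to imperfect reciprocity or to unbalanced sending and receiving rates, the mean-field result $\gamma = 1 + (M_{in}/M_{out})/R$ can be evaluated directly. The sketch below also draws folder sizes from the mean-field picture for younger senders only (entry times uniform in time, $k = (t/t_j)^{\alpha}$), for which the fraction of folders larger than $x$ should be close to $x^{-1/\alpha} = x^{-(\gamma - 1)}$. The function names are hypothetical, and the sampling ignores the fluctuations discussed above.

```python
import random


def gamma_exponent(reciprocity=1.0, m_in=1.0, m_out=1.0):
    """Mean-field exponent gamma = 1 + (M_in / M_out) / R."""
    return 1.0 + (m_in / m_out) / reciprocity


def sample_mean_field_folder_sizes(alpha, n_senders, seed=0):
    """Folder sizes k = (t / t_j) ** alpha for uniformly distributed t_j / t."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], which avoids a division by zero.
    return [(1.0 / (1.0 - rng.random())) ** alpha for _ in range(n_senders)]


def tail_fraction(sizes, threshold):
    """Fraction of folders strictly larger than threshold."""
    return sum(1 for k in sizes if k > threshold) / len(sizes)
```

For example, `gamma_exponent(reciprocity=0.8)` gives 2.25 and `gamma_exponent(m_out=2.0)` gives 1.5, which illustrates that moderate departures from $R = 1$ and $M_{out}/M_{in} = 1$ keep the exponent in the neighborhood of 2.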

Our model, based on the preferential exchange ingredient, reproduces rather
nicely the behavior of the data for a large range of parameter values. As
previously observed, this mechanism is similar to the preferential
attachment model proposed by Barabási and Albert to explain the emergence
of the scale-free topology of some networks. The mathematical similarity
extends also to some other results: if, for example, the preferential
exchange rate equation is modified so that the numerator in the r.h.s. of
Eq. (2) becomes a power $k(i \to j)^{\beta}$, then the e-mail distribution
becomes a stretched exponential, as happens in the context of network
topology.

Nevertheless relevant differences between the two rules 
appear when considering the nature of social networks. 
Firstly, preferential exchange works on a local basis, 
which means that two people can increase the strength of 
their link ignoring what is happening to the other links. 
Instead, in the preferential attachment model the new- 
comers need a full knowledge of the network degrees in 
order to decide their connectivity. Secondly, and more 
importantly, the rate of change of the e-mails that indi- 
vidual i receives from j depends only on the number of 
e-mails that traveled in the opposite direction and on the 
total number of messages that j ever received (both local 
quantities available to the two people). Therefore preferential
exchange is intrinsically symmetric, while preferential
attachment divides the topology of the network into
hubs and poorly connected nodes. In summary, this is a
symmetrically cooperative model where no global infor- 
mation is necessary. 

Interestingly, more data have recently emerged about the connection
strengths in scientific collaboration networks, airport traffic and other
systems, showing that the measured strengths are indeed power-law, or at
least fat-tail, distributed. Networks often evolve through relations that
get stronger in time thanks to positive feedback, that is, the more an
individual (in the social context) has given to another one, the more the
latter is likely to give back in return. Moreover, many networks also grow
in time. Implementing these ingredients in a simple model nicely reproduces
the qualitative (algebraic) behavior that we observe in real e-mail data.
Quantitative agreement is obtained when we add good reciprocity: the
exchange is a "fair" process. We believe that these ingredients do indeed
shape social and other networks, and the e-mail network, as a particular
example, is extremely well suited to provide us with a wealth of data that
could be difficult to gather for other networks (it has been found recently
that in a sample of mailboxes at the HP Laboratories the median number of
e-mails was 2200, indicating that a large amount of data could be, in
principle, available for analysis [23]). Moreover, we have so far neglected
the interplay between the dynamics over the network and the network
structure itself, whereas we expect, in principle, that the two should
co-evolve toward some stationary state.

At the same time we expect that in most real situations this model could be
refined by introducing a more detailed description of the process of
interaction. For example, a large variability in people's attitudes could
be captured by defining a local intrinsic quantity shaping the mechanism of
link reinforcement. As in the case of the preferential attachment
mechanism, such a generalization does not remove the power-law nature of
the probability distributions involved [24, 25, 26], but rather qualifies
the kind of critical processes going on in the system. Further work is
needed in this direction, and more data about the structure of networks and
the strength of the connections should be made available to develop and
validate models.

From a more general point of view, e-mail networks 
on one hand, and simulations on the other, can help 
investigate the large scale consequences of fairness and 
reciprocity: these two ingredients are often deemed as 
determinant in shaping social relations, yet their effects 
are usually studied for small groups of people and short 
times. The use of computers, both as data resources and 
as simulation tools, can easily bring these studies to large 
scales.

FIG. 1: Log-log representation of the e-mail distribution in
five sets of folders (empty circles, full squares, and other
symbols). They are remarkably similar to each other, hinting
towards some form of universality. Data have been exponentially
binned to reduce noise. The straight line is a power-law

FIG. 2: Log-log representation of the e-mail distribution of
two random individuals of the model, with $S_0 = 2$, $S = 2000$
and $M_{out} = 100$. The best power-law fit yields an exponent
1.8(2) (straight line).



